Combining disparate data sources for improved poverty prediction and mapping
Abstract
More than 330 million people still live in extreme poverty in Africa. Timely, accurate, and spatially fine-grained baseline data are essential for shaping poverty-reduction policy. The potential of "Big Data" to estimate socioeconomic factors in Africa has been demonstrated. However, most current studies are limited to a single data source. We propose a computational framework that accurately predicts the Global Multidimensional Poverty Index (MPI) at the finest spatial granularity, covering 552 communes in Senegal, using environmental data (related to food security, economic activity, and accessibility to facilities) and call data records (capturing individualistic, spatial, and temporal aspects of people). Our framework is based on Gaussian Process regression, a Bayesian learning technique that provides the uncertainty associated with each prediction. We perform model selection using elastic net regularization to prevent overfitting. Our results empirically demonstrate the superior accuracy obtained from disparate data (Pearson correlation of 0.91). Our approach also accurately predicts important dimensions of poverty: health, education, and standard of living (Pearson correlation of 0.84-0.86). All predictions are validated against deprivations calculated from census data. Our approach can be used to generate poverty maps frequently, and its diagnostic nature is likely to assist policy makers in designing better interventions for poverty eradication.
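As a rough illustration of the kind of model the abstract describes, the sketch below fits a Gaussian Process regression with NumPy and reports a predictive standard deviation alongside each prediction, then scores the predictions with a Pearson correlation. Everything in it is an assumption made for illustration: the data are synthetic stand-ins for commune-level features and MPI values, the kernel and its hyperparameters (`length_scale`, `noise_var`) are arbitrary choices, and the elastic net model-selection step is left out. It is not the authors' implementation.

```python
import numpy as np


def rbf_kernel(a, b, length_scale):
    """Squared-exponential covariance (unit prior variance) between rows of a and b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_predict(x_train, y_train, x_test, length_scale=2.0, noise_var=0.05):
    """Return the GP posterior mean and standard deviation of the latent function."""
    y_mean = y_train.mean()  # centre targets so a zero-mean prior is reasonable
    k_train = rbf_kernel(x_train, x_train, length_scale)
    k_train[np.diag_indices_from(k_train)] += noise_var  # observation noise
    chol = np.linalg.cholesky(k_train)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - y_mean))
    k_cross = rbf_kernel(x_train, x_test, length_scale)
    mean = k_cross.T @ alpha + y_mean
    v = np.linalg.solve(chol, k_cross)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance minus explained variance
    return mean, np.sqrt(np.maximum(var, 0.0))


# Synthetic stand-in data (hypothetical): 200 "regions" with 4 features each.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 4))
target = (
    np.sin(features[:, 0])
    + 0.5 * features[:, 1]
    + 0.1 * rng.normal(size=200)
)

x_train, x_test = features[:150], features[150:]
y_train, y_test = target[:150], target[150:]

pred_mean, pred_std = gp_predict(x_train, y_train, x_test)

# Pearson correlation between held-out targets and predictions.
pearson = np.corrcoef(pred_mean, y_test)[0, 1]
print(f"Pearson correlation on held-out data: {pearson:.2f}")
print(f"Mean predictive standard deviation: {pred_std.mean():.2f}")
```

The per-point standard deviation is what makes this family of models useful for mapping: regions with a large `pred_std` are ones where the prediction should be treated with more caution.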
Similar resources
Hierarchical Alpha-cut Fuzzy C-means, Fuzzy ARTMAP and Cox Regression Model for Customer Churn Prediction
As customers are the main asset of any organization, customer churn management is becoming a major task for organizations to retain their valuable customers. In the previous studies, the applicability and efficiency of hierarchical data mining techniques for churn prediction by combining two or more techniques have been proved to provide better performances than many single techniques over a nu...
Using geographic information systems to understand health care access.
BACKGROUND Determining a community's health care access needs and testing interventions to improve access are difficult. This challenge is compounded by the task of translating the relevant data into a format that is clear and persuasive to policymakers and funding agencies. Geographic information systems can analyze and transform complex data from various sources into maps that illustrate prob...
Digital mapping of diagnostic horizons and soil great groups in the Zarand region of Kerman
Digital soil mapping includes soils, spatial prediction and their properties based on the relationship with covariates. This study was designed for digital soil mapping using binary logistic regression and boosted regression tree in Zarand region of Kerman. A stratified sampling scheme was adopted for the 90,000 ha area based on which, 123 soil profiles were described. In both approaches, the o...
Mapping poverty using mobile phone and satellite data
Poverty is one of the most important determinants of adverse health outcomes globally, a major cause of societal instability and one of the largest causes of lost human potential. Traditional approaches to measuring and targeting poverty rely heavily on census data, which in most low- and middle-income countries (LMICs) are unavailable or out-of-date. Alternate measures are needed to complement...
A Bayesian framework for combining heterogeneous data sources for gene function prediction (in Saccharomyces cerevisiae).
Genomic sequencing is no longer a novelty, but gene function annotation remains a key challenge in modern biology. A variety of functional genomics experimental techniques are available, from classic methods such as affinity precipitation to advanced high-throughput techniques such as gene expression microarrays. In the future, more disparate methods will be developed, further increasing the ne...
Journal title:
Volume 114, Issue
Pages -
Publication date: 2017